Unit 3 topics (that we’ve covered thus far)

Week 10 - Sampling distributions and confidence intervals

Effect size

Practical vs statistical significance

Week 11 - Hypothesis tests for an unknown proportion or mean

Types of conclusions

The only way to reduce both types of error at the same time is to collect more evidence or, in statistical terms, to collect more data.

  • \(\alpha = Pr(\text{Type I error})\): If \(H_0\) is true, this is the probability that we (incorrectly) reject it.

  • \(\beta = Pr(\text{Type II error})\): If \(H_0\) is false, this is the probability that we (incorrectly) fail to reject it.

  • \(1-\beta = \text{Power}\): If \(H_0\) is false, this is the probability that we (correctly) reject it.
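To make the link between sample size and these probabilities concrete, here is a minimal sketch using R's built-in `power.t.test()`. The effect size (0.5), standard deviation (1), and sample sizes (20 and 80) are hypothetical values chosen only for illustration; they do not come from the course data.

```r
# Hypothetical setup: one-sample t-test where the true mean sits
# 0.5 standard deviations away from the value claimed by H0.
alpha <- 0.05   # Pr(Type I error), fixed in advance by the analyst

power_small <- power.t.test(n = 20, delta = 0.5, sd = 1,
                            sig.level = alpha, type = "one.sample")$power
power_large <- power.t.test(n = 80, delta = 0.5, sd = 1,
                            sig.level = alpha, type = "one.sample")$power

# beta = Pr(Type II error) = 1 - power
beta_small <- 1 - power_small
beta_large <- 1 - power_large

c(power_small = power_small, power_large = power_large)
c(beta_small = beta_small, beta_large = beta_large)
```

With \(\alpha\) held at \(0.05\), the larger sample has higher power, so \(\beta\) shrinks without \(\alpha\) having to grow.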

Week 12 - Inference from two samples (grouped data)

Example: Confidence interval for a difference in means (from Week 12)

On average, how much more money do consumers spend at Target compared to Walmart?

Suppose researchers collected a systematic sample of \(85\) Walmart customers and \(80\) Target customers by asking them for their purchase amount as they left the stores. The data they collected are summarized in the table below. Suppose a computer has already calculated the degrees of freedom to be \(162.75\).

              Walmart    Target
\(\bar{x}\)   \(\$45\)   \(\$53\)
\(s\)         \(\$21\)   \(\$19\)

Step 1) Identify and define the population parameter and choose your confidence level.

Step 2) Calculate the sample estimate for the population parameter.

Step 3) Assess the required assumptions and conditions.

Step 4) Find the critical value corresponding to your confidence level.

Step 5) Calculate the standard error of your sample estimate.

Step 6) Calculate the lower and upper bounds of your confidence interval.
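
The numerical steps (2, 4, 5, and 6) can be sketched in R from the summary statistics alone. This is only a sketch: the \(95\%\) confidence level is an assumption, since the example leaves the level for you to choose, and the variable names are made up for illustration. The standard error uses the unpooled (separate-variances) formula, which is the one that goes with a non-integer degrees of freedom like \(162.75\).

```r
# Summary statistics from the table above
n_w <- 85; xbar_w <- 45; s_w <- 21   # Walmart
n_t <- 80; xbar_t <- 53; s_t <- 19   # Target
df  <- 162.75                        # degrees of freedom given in the problem

conf_level <- 0.95                   # ASSUMPTION: the example does not fix a level

# Step 2: sample estimate of mu_Target - mu_Walmart
estimate <- xbar_t - xbar_w

# Step 4: critical value t* for the chosen confidence level
t_star <- qt(1 - (1 - conf_level) / 2, df = df)

# Step 5: standard error of the difference in sample means (unpooled)
se <- sqrt(s_t^2 / n_t + s_w^2 / n_w)

# Step 6: lower and upper bounds of the confidence interval
ci <- estimate + c(-1, 1) * t_star * se
ci
```

Under these assumptions the interval runs from roughly \(\$1.85\) to \(\$14.15\), which lies entirely above zero. Steps 1 and 3 (defining the parameter and checking conditions) still have to be argued in words.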


Example: Confidence interval for a mean difference of paired data (from Week 13)

On average, how large is the difference in car insurance prices for customers of an online insurance company versus customers of a local insurance company?

Find a \(95\%\) confidence interval for the mean difference in insurance prices based on the data given below.

mean(insurance_diff$PriceDiff)
## [1] 45.9
sd(insurance_diff$PriceDiff)
## [1] 175.6628
t.test(insurance_diff$PriceDiff, mu=0, conf.level=0.95)
## 
##  One Sample t-test
## 
## data:  insurance_diff$PriceDiff
## t = 0.82629, df = 9, p-value = 0.43
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -79.76163 171.56163
## sample estimates:
## mean of x 
##      45.9
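
As a check on where the interval in the output comes from, it can be rebuilt by hand from the reported mean and standard deviation. The sketch below assumes \(n = 10\) pairs, inferred from `df = 9` in the output; the variable names are hypothetical.

```r
# Values reported in the output above
xbar_d <- 45.9        # mean of the paired differences
s_d    <- 175.6628    # standard deviation of the paired differences
n      <- 10          # number of pairs, inferred from df = n - 1 = 9

se     <- s_d / sqrt(n)                  # standard error of the mean difference
t_star <- qt(0.975, df = n - 1)          # critical value for 95% confidence
ci     <- xbar_d + c(-1, 1) * t_star * se
t_stat <- xbar_d / se                    # test statistic for H0: mean difference = 0

round(ci, 2)
round(t_stat, 5)
```

The hand calculation should agree with the `t.test()` interval up to rounding. Because that interval contains \(0\), it is consistent with the p-value of \(0.43\): the data do not rule out a mean difference of zero.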

Looking ahead